Smart Paper Notes

PointCaps: Raw Point Cloud Processing using Capsule Networks with Euclidean Distance Routing

Dishanika Denipitiyage, Vinoj Jayasundara, Ranga Rodrigo, Chamira U. S. Edussooriya

Category: Computer Vision

2021-12-21

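Raw point cloud processing using capsule networks is widely adopted in classification, reconstruction, and segmentation because it preserves the spatial agreement of the input data. However, most existing capsule-based network approaches are computationally heavy and fail at representing the entire point cloud as a single capsule. We address these limitations of existing capsule-network-based approaches by proposing PointCaps, a novel convolutional capsule architecture with parameter sharing. Along with PointCaps, we propose a novel Euclidean distance routing algorithm and a class-independent latent representation. The latent representation captures physically interpretable geometric parameters of the point cloud; with dynamic Euclidean routing, PointCaps represents the spatial (point-to-part) relationships of points well. PointCaps has significantly fewer parameters and requires significantly fewer FLOPs, while achieving better reconstruction with comparable classification and segmentation accuracy on raw point clouds compared with state-of-the-art capsule networks.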
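The abstract names a Euclidean distance routing algorithm but gives no details. As a rough illustration only, the sketch below shows one way routing by Euclidean distance could work: each input capsule casts a vote for every output capsule, and coupling coefficients are strengthened for votes that lie close to the current output. The function name `euclidean_routing`, the softmax over negative distances, and the iteration count are assumptions made for this sketch, not the procedure from the paper.

```python
import numpy as np


def euclidean_routing(votes, n_iters=3):
    """Illustrative routing-by-distance sketch (hypothetical, not the paper's algorithm).

    votes: array of shape (n_in, n_out, dim), the prediction each input
           capsule makes for each output capsule.
    Returns (outputs, coupling) with shapes (n_out, dim) and (n_in, n_out).
    """
    n_in, n_out, _ = votes.shape
    # Start with every input capsule spreading its vote uniformly.
    coupling = np.full((n_in, n_out), 1.0 / n_out)
    for _ in range(n_iters):
        # Each output capsule is the coupling-weighted mean of its votes.
        weights = coupling / coupling.sum(axis=0, keepdims=True)
        outputs = np.einsum("io,iod->od", weights, votes)
        # Euclidean distance between every vote and its output capsule.
        dist = np.linalg.norm(votes - outputs[None, :, :], axis=-1)
        # Assumption: closer votes get larger coupling via softmax(-distance).
        logits = -dist
        logits = logits - logits.max(axis=1, keepdims=True)
        coupling = np.exp(logits)
        coupling = coupling / coupling.sum(axis=1, keepdims=True)
    return outputs, coupling
```

Calling `euclidean_routing` on a random `(8, 4, 16)` array returns four 16-dimensional output capsules and an `(8, 4)` coupling matrix whose rows each sum to one.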

translated by Google Translate

PatchGame: Learning to Signal Mid-level Patches in Referential Games

Kamal Gupta, Gowthami Somepalli, Anubhav Gupta, Vinoj Jayasundara, Matthias Zwicker, Abhinav Shrivastava

Categories: Computer Vision | Machine Learning

2021-11-02

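We study a referential game (a type of signaling game) in which two agents communicate with each other through a discrete bottleneck to achieve a common goal. In our referential game, the goal of the speaker is to compose a message, or symbolic representation, of "important" image patches, while the task of the listener is to match the speaker's message to a different view of the same image. We show that the two agents can indeed develop a communication protocol without explicit or implicit supervision. We further investigate the developed protocol and show its applications in speeding up recent Vision Transformers by using only the important patches, and as pre-training for downstream recognition tasks (e.g., classification). Code is available at https://github.com/kampta/patchgame.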

TimeCaps: Capturing Time Series Data With Capsule Networks

Hirunima Jayasekara, Vinoj Jayasundara, Mohamed Athif, Jathushan Rajasegaran, Sandaru Jayasekara, Suranga Seneviratne, Ranga Rodrigo

Categories: Machine Learning | Artificial Intelligence | Machine Learning (Statistics)

2019-11-26

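Capsule networks excel at understanding spatial relationships in 2D data for vision-related tasks. Even though they were not designed to capture 1D temporal relationships, with TimeCaps we demonstrate that, given the ability, capsule networks also excel at understanding temporal relationships. To this end, we generate capsules along the temporal and channel dimensions, creating two temporal feature detectors that learn contrasting relationships. TimeCaps surpasses the state-of-the-art results by achieving 96.21% accuracy in identifying 13 electrocardiogram (ECG) signal beat categories, while achieving on-par results in identifying 30 classes of short audio commands. Furthermore, the instantiation parameters inherently learned by capsule networks allow us to completely parameterize 1D signals, which opens up various possibilities in signal processing.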

GAUSS: Guided Encoder-Decoder Architecture for Hyperspectral Unmixing with Spatial Smoothness

Yasiru Ranasinghe, Kavinga Weerasooriya, Roshan Godaliyadda, Vijitha Herath, Parakrama Ekanayake, Dhananjaya Jayasundara, Lakshitha Ramanayake, Neranjan Senarath, Dulantha Wickramasinghe

Category: Computer Vision

2022-04-16

In the recent hyperspectral unmixing (HU) literature, deep learning (DL) has become increasingly prominent, especially through the autoencoder (AE) architecture. We propose a split architecture and use a pseudo-ground truth for abundances to guide the optimization of the 'unmixing network' (UN). Preceding the UN, an 'approximation network' (AN) is proposed to improve the association between the centre pixel and its neighbourhood. Because the AN output is both the input to the UN and the reference for the 'mixing network' (MN), it accentuates spatial correlation in the abundances. In the Guided Encoder-Decoder Architecture for Hyperspectral Unmixing with Spatial Smoothness (GAUSS), we propose using one-hot encoded abundances as the pseudo-ground truth to guide the UN; these are computed with the k-means algorithm, which avoids relying on prior HU methods. Furthermore, we release the single-layer constraint on the MN by introducing the UN-generated abundances, in contrast to the standard AE for HU. We also experiment with two modifications of the pre-trained network obtained with the GAUSS method. In GAUSS$_\textit{blind}$, we concatenate the UN and the MN to back-propagate the reconstruction error gradients to the encoder. In GAUSS$_\textit{prime}$, the abundance results of a reliable signal processing (SP) method are used as the pseudo-ground truth with the GAUSS architecture. According to quantitative and graphical results on four experimental datasets, the three architectures either surpassed or matched the performance of existing HU algorithms from both the DL and SP domains.
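The one-hot pseudo-ground truth mentioned above can be pictured with a small sketch. It assumes, for illustration only, that k-means is run directly on the pixel spectra with one cluster per endmember, and that each pixel's cluster label is then one-hot encoded; the function name `kmeans_one_hot_abundances`, the initialization, and the iteration count are hypothetical and not taken from the paper.

```python
import numpy as np


def kmeans_one_hot_abundances(pixels, n_endmembers, n_iters=20, seed=0):
    """Cluster spectra with plain k-means and one-hot encode the labels.

    pixels: float array of shape (n_pixels, n_bands).
    Returns (one_hot, centers) with shapes (n_pixels, n_endmembers)
    and (n_endmembers, n_bands).
    """
    rng = np.random.default_rng(seed)
    # Assumption: initialize the centers from randomly chosen pixels.
    idx = rng.choice(len(pixels), size=n_endmembers, replace=False)
    centers = pixels[idx].astype(np.float64)

    def assign(c):
        # Nearest center for every pixel, by Euclidean distance.
        d = np.linalg.norm(pixels[:, None, :] - c[None, :, :], axis=-1)
        return d.argmin(axis=1)

    for _ in range(n_iters):
        labels = assign(centers)
        for k in range(n_endmembers):
            members = pixels[labels == k]
            if len(members) > 0:  # leave an empty cluster's center unchanged
                centers[k] = members.mean(axis=0)
    labels = assign(centers)
    # Each row has a single 1 marking the pixel's cluster.
    one_hot = np.eye(n_endmembers)[labels]
    return one_hot, centers
```

Each row of `one_hot` sums to one, so it can stand in for an abundance vector that assigns the pixel entirely to a single endmember.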

KORSAL: Key-point Detection based Online Real-Time Spatio-Temporal Action Localization

Kalana Abeywardena, Shechem Sumanthiran, Sakuna Jayasundara, Sachira Karunasena, Ranga Rodrigo, Peshala Jayasekara

Category: Computer Vision

2021-11-05

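Real-time, online action localization in video is a critical yet highly challenging problem. Accurate action localization requires using both temporal and spatial information. Recent attempts achieve this with computationally intensive 3D CNN architectures or highly redundant two-stream architectures, both of which are unsuitable for real-time, online applications. To accomplish activity localization under highly challenging real-time constraints, we propose using fast and efficient key-point-based bounding box prediction to localize actions spatially. We then introduce a tube-linking algorithm that maintains the temporal continuity of action tubes in the presence of occlusions. Furthermore, we eliminate the need for a two-stream architecture by combining temporal and spatial information into a cascaded input to a single network, allowing the network to learn from both types of information. Temporal information is extracted efficiently using a structural similarity index map rather than computationally intensive optical flow. Despite the simplicity of our approach, our lightweight end-to-end architecture achieves a state-of-the-art frame-mAP of 74.7% on the challenging UCF101-24 dataset, a performance gain of 6.4% over the previous best online methods. We also achieve state-of-the-art video-mAP results compared with both online and offline methods. Moreover, our model achieves a frame rate of 41.8 FPS, a 10.7% improvement over contemporary real-time methods.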
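To make the structural similarity index map concrete, the sketch below computes a per-pixel SSIM map between two grayscale frames using the standard SSIM formula with a uniform window. The window size, the uniform (rather than Gaussian) weighting, and the function name `ssim_map` are assumptions for this illustration; the abstract does not say how the map is computed or how it is combined with the spatial input.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def ssim_map(frame_a, frame_b, window=7, data_range=255.0):
    """Per-pixel SSIM between two grayscale frames (uniform window, valid region).

    frame_a, frame_b: 2D arrays of the same shape (H, W).
    Returns an array of shape (H - window + 1, W - window + 1).
    """
    a = frame_a.astype(np.float64)
    b = frame_b.astype(np.float64)
    # Standard SSIM stabilizing constants (K1 = 0.01, K2 = 0.03).
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    # Views of every window-by-window patch, without copying the frames.
    wa = sliding_window_view(a, (window, window))
    wb = sliding_window_view(b, (window, window))
    mu_a = wa.mean(axis=(-2, -1))
    mu_b = wb.mean(axis=(-2, -1))
    var_a = wa.var(axis=(-2, -1))
    var_b = wb.var(axis=(-2, -1))
    cov = (wa * wb).mean(axis=(-2, -1)) - mu_a * mu_b
    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return numerator / denominator
```

Regions that change between consecutive frames produce low SSIM values, so the map highlights motion at a much lower cost than dense optical flow.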
